Abstract

A study was conducted to identify poor readers and to characterize weaknesses in their knowledge and 
use of story structure in comprehension and recall. Subjects were 80 year-3 children, 20 good readers 
and 60 poor readers. The poor readers were then divided into relatively homogeneous subgroups, using 
measures of language-reading comprehension, according to a numerical classification procedure. This 
procedure helped identify specific weaknesses in their language-reading comprehension. All children 
listened to 3 stories and retold the stories under free- and probe-recall conditions. Comparison of 
recalls between the good readers and each of the subgroups of poor readers showed that poor readers 
in 2 subgroups evidenced reduced sensitivity to story structure. The children in these subgroups recalled 
less of the stories overall, recalled less information from story grammar categories to varying extents, 
and showed patterns of category recall which differed from those of normal readers. Children in one 
of the subgroups also displayed poor perception of causal relations across story episode boundaries. 
These results provide evidence of marked heterogeneity in poor readers' story comprehension and recall. 
Certain subgroups of poor readers may have qualitatively different problems processing stories, relative 
to other poor readers, which may require a more concerted approach to instruction in story structure. 
INDIVIDUAL DIFFERENCES IN
STORY COMPREHENSION AND 
RECALL OF POOR READERS 



A common approach to research on children with reading difficulties is to compare the status of groups 
of readers of different ability. The research designs define "good" and "poor" readers according to some 
criterion and then compare the status of these readers on measures of cognitive performance. The goal 
of some of these studies is to identify the underlying cause(s) of children's reading difficulties. Such 
status studies have compared groups of readers on a wide variety of cognitive measures, including 
isolated word recognition, oral reading, vocabulary knowledge, and memory (for reviews, see Aulls, 1981; 
Kleiman, 1985; Lipson & Wixson, 1986). However, to date, such studies have failed to identify 
consistently an area of cognitive performance that is responsible for reading difficulty. 

This inconsistency in findings has been particularly true of studies in the area of story comprehension 
and recall. In these studies, children listen to stories and then perform a comprehension or recall task. 
By comparing the good and poor readers' comprehension and recall, researchers are able to make 
inferences about children's relative sensitivity to aspects of story structure. The rationale for these 
studies is that if good and poor readers differ in sensitivity to story structure, independent of their 
decoding abilities, then deficiency in story schema, or failure to use story schema, may be responsible 
for some of the difficulties experienced by poor readers. Results have shown quantitative differences 
between readers of different ability--poor readers comprehend and recall less of a story than do good
readers (although even here there is some inconsistency)--but results have been inconclusive as to 
whether there are qualitative differences in comprehension and recall. Thus, it is not yet clear whether 
good and poor readers differ in sensitivity to story structure. 

Studies have employed a variety of methods to assess sensitivity to story structure. One group of studies 
has investigated students' awareness of relative importance of idea units in stories. Levels of importance 
of idea units have been empirically defined using Johnson's (1970) technique. This approach is largely 
atheoretical with respect to the role of idea units in story comprehension and recall. As such, the
studies do not necessarily implicate story-specific knowledge. Results have been mixed. On the one 
hand, Smiley, Oakley, Worthen, Campione, and Brown (1977) and Wong (1979) found that poor readers 
recalled fewer idea units than did good readers and that they were sensitive to fewer gradations of 
structural importance than were good readers. On the other hand, Luftig and Greeson (1983) found 
no differences in sensitivity to gradations of importance between educable mentally retarded and normal 
children, and Worden and Nakamura (1982) found no such differences between learning-disabled and 
normal college students. Worden and Nakamura found no differences even in overall recall of the 2 
groups (though these results may have been due to students' repeated exposure to the stories). 

Another group of studies has investigated students' sensitivity to story structure using more theoretically 
motivated propositional analyses. Hansen (1978), using Kintsch's (1974) propositional model, found that
learning-disabled children recalled fewer propositions overall and recalled fewer superordinate 
propositions than did normal children. The 2 groups did not differ in recall of subordinate propositions 
(see also Weisberg, 1979). By contrast, Feagans and Short (1984) parsed their stories in terms of "action 
units" and failed to find any major differences between reading-disabled and normal children in 
comprehension and recall of these units. Wolman (1991), using Trabasso, Secco, and van den Broek's 
(1984) causal network analysis, found that children with mild disabilities recalled less than did children 
without disabilities but, again, failed to find differences between the groups in sensitivity to causal 
connections in stories. 

By far the largest group of studies has investigated students' sensitivity to story grammar categories. 
Story grammars are analytical tools that describe the structural components of narrative text (Mandler 
& Johnson, 1977; Rumelhart, 1975; Stein & Glenn, 1979; Thorndyke, 1977). Although story grammars
differ in detail, the categories of information are essentially the same. Stein and Glenn's (1979) well- 
known grammar proposes that a story consists of a major setting (main character), minor setting (time 
and place) followed by one or more episodes. The episodes comprise 6 categories: initiating events,
internal responses, internal plans, attempts, direct consequences, and reactions. Story grammars are
assumed to approximate the readers' (or listeners') cognitive schema that guide the encoding and 
retrieval of story information. As such, studies of readers' sensitivity to grammatical categories more 
closely implicate story-specific knowledge. 

Results of studies comparing good and poor readers have again been equivocal. A classic study by 
Weaver and Dickinson (1982) (see also Weaver, 1978; Weaver & Dickinson, 1979) examined the story 
recall of 10- and 13-year-old "dyslexic" boys and compared their results with those from Stein and
Glenn's (1979) normal grade-5 readers using the Stein and Glenn grammar. They found no differences
between the younger disabled and normal readers in overall recall (comparison with the older disabled 
readers was not made because of the age difference) and only 2 differences in recall of information 
within categories: the disabled readers recalled somewhat less of the character's thoughts or feelings
about the outcome (the reaction category) and of the story context (minor setting). Moreover, they 
found only minor differences in the pattern of recall of story grammar categories (the rank order of 
recall of the categories). 

A number of other studies comparing good readers with various categories of poor readers have 
reported similar results, finding little evidence of differential sensitivity to story grammar categories 
(Backman, Lundberg, Nilsson, & Ohlsson, 1984; McConaughy, 1985; Summers, 1980; Worden, 
Malmgren, & Gabourie, 1982; see also Gold, 1983). These results stand in marked contrast to those 
of Fitzgerald (1984), Hinchley and Levy (1988) and Rahman and Bisanz (1986) who reported having 
identified poor readers who demonstrated reduced sensitivity to story grammar categories (see also 
Barnhart, 1990). 

A fundamental problem with the above studies, and with status studies generally, that may account for 
the inconsistency in findings, is heterogeneity in the samples of poor readers. This was suggested by 
Wiener and Cromer (1967) and elaborated by Applebee (1971), Elkins (1978), Kleiman (1985), and 
Singer (1982). Most studies examining the performance of poor readers, relative to that of good 
readers, have ignored individual differences and assumed that the poor readers constitute a 
homogeneous group. However, Applebee and others pointed out that there may be considerable 
heterogeneity of reading difficulties in samples of poor readers. They argued that if there were relatively 
homogeneous subgroups within the poor reader sample, and these subgroups were ignored by averaging
across the subgroups, then differences between the good and poor readers may be obscured. 
Comparison of the poor readers' performance with that of good readers may reveal no differences or 
differences that were unstable (i.e., sample specific) and that did not apply to any one subgroup (see 
also Backman, Mamen, & Ferguson, 1984; Harris, 1978-1979; Lipson & Wixson, 1986). 

Note that this argument posits the existence of systematic individual differences. Individuals naturally 
differ from each other in a variety of ways. The existence of systematic differences suggests that there 
are similarities as well as differences in the ways students perform and that it is possible to distinguish 
subgroups reflecting the systematic rather than random component of variation between students. 
Students in any given subgroup should share a pattern of performance on variables that defines that 
subgroup and distinguishes its members from those of other subgroups (Applebee, 1971; Kareev, 1982). 

There are indications that the problem of heterogeneity in samples of poor readers is implicated in story 
comprehension research. Weaver and Dickinson noted large variation in performance within their group 
of dyslexic students (see especially Weaver & Dickinson, 1979). Indeed, when they divided their
dyslexics into subgroups based on verbal-performance IQ discrepancy scores, the few significant 
differences obtained in their dyslexic-normal comparison seem to have been due to only one subgroup
of disabled readers (the less verbally proficient) and the non-significant finding for total recall obtained
only for comparison of the normal readers with another subgroup (the more verbally proficient).
Hinchley and Levy (1988) also obtained results that suggest that there may be a large individual
difference component in story comprehension and recall and that only some poor readers have deficits 
in story-structure knowledge. 

Another problem that may account for the inconsistency in findings is that the measures employed may 
not have assessed relevant aspects of students' knowledge or use of story structure. Story grammars 
specify not only certain categories of story information but also the relations among the categories. Stein 
(1982) has argued that differences between good and poor readers may be found only if students are 
required to perform tasks that deal with the relational properties of stories. Indeed, studies that have 
used tasks requiring students to deal with the relational properties of stories, by having them anticipate 
upcoming story information or recall stories that deviate from the canonical form prescribed by a 
grammar, have found significant differences between good and poor readers in sensitivity to story 
structure (Fitzgerald, 1984; Hinchley & Levy, 1988; Rahman & Bisanz, 1986). 

The purpose of the present study was to identify poor readers who show weaknesses in their knowledge 
and use of story structure. The study sought to address the 2 problems described above. First, to take 
into account individual differences among poor readers, we identified homogeneous subgroups within 
the poor reader sample by numerically classifying the children on the basis of the component structure 
of reading comprehension ability. There is ample evidence that poor readers can be grouped into 
distinguishable subgroups (e.g., Carr, Brown, Vavrus, & Evans, 1990; Doehring, Trites, Patel, &
Fiedorowicz, 1981; Lovett, 1984; Torgesen, 1982), although it is uncertain whether knowledge of story 
structure relates to any of the groupings. We reasoned that if any deficits in story-specific knowledge 
could be found, they would obtain for only some subgroup(s) of poor readers. 

We used the numerical classification procedure as a device to address the problem of heterogeneity in 
our sample of poor readers. Using this approach, we hoped to be able to separate systematic individual 
differences from differences due to random error and, thereby, detect ability-group differences in story 
comprehension and recall that heretofore may have gone unnoticed. Our point is that previous story 
grammar studies may have not only overlooked some interesting findings about differences within the 
group of poor readers, they may have also lumped the variance associated with these differences 
together with error variance, thus decreasing the power of tests of differences between groups of good 
and poor readers. 

Second, to assess children's perception of relations among story information, in addition to assessing 
children's free recall of stories, we examined their probe recall of causal relations. The probes were 
designed to provide a more structure-dependent measure of recall. We decided to focus on causal 
relations as these have been found to be an important determinant of reproduction probability in 
summarization tasks (Graesser, 1981; Lehnert, Black, & Reiser, 1981) and have received attention in
recent systems of text analysis (Trabasso & van den Broek, 1985). Probe questions were constructed 
for each story and targeted at either inter- or intra-episodic causal relations. Higher order probes were 
also used so that the extent of a causal relation perceived by a child could be assessed. 

METHOD 

Sample 

Eighty children were selected from an initial pool of 204 children attending year-3 classes in 4 schools 
in a lower middle class area of Brisbane. The children were selected according to their scores on 3 
measures of reading ability administered at the beginning of the school year: the Southgate Word
Selection and Sentence Completion Tests (Form A) (Southgate, 1959) and teacher ratings of reading
ability measured on a 5-point Likert scale. Measures of passage comprehension were not used in this 
initial selection, so as not to bias the sample in favour of children with strengths or weaknesses in story 
comprehension ability. A principal factor analysis of scores on the 3 measures for the 204 children 
yielded one factor that accounted for 74.4% of the variance and showed loadings of .79, .93, and .86, 
respectively. The 20 children with estimated factor scores above the 90th percentile were selected as 
"good readers" (6 boys and 14 girls), and the 60 children with scores between the 10th and 40th 
percentiles were selected as "poor readers" (28 boys and 32 girls). The reading ability of children below 
the 10th percentile was judged to be insufficient to cope with task requirements later in the study. 

Materials 

Classification measures. To classify the poor readers into subgroups, test measures were used to assess 
a number of components of reading comprehension. Based on a factor analytic study of the reading test 
performance of Australian primary school children (Spearritt, 1977), the domain of reading 
comprehension was defined in terms of 4 components: vocabulary knowledge, reading speed, sentence 
comprehension, and passage comprehension. In addition, because of the young age of the children, tests 
of oral language ability were included (see Aaron, 1980; Elkins, 1978). 

The measures used were: the Auditory Association, Grammatic Closure, and Sound Blending subtests 
of the Illinois Test of Psycholinguistic Abilities (ITPA) (Kirk, McCarthy, & Kirk, 1968), selected to 
approximate oral language assessment at the semantic, syntactic, and phonological levels respectively; 
the vocabulary subtest of the Progressive Achievement Test (Elley & Reid, 1969); a test of reading 
speed constructed by the authors, which comprised 3 passages of grade-appropriate difficulty from a 
reading program not used by the participating schools (Hart, Walker, Gray, & Walker, 1977; Walker, 
Walker, & Hart, 1979); and measures of sentence and passage comprehension, also constructed by the 
authors, which were derived from a cloze version of a 308-word story from the same reading program 
(deleting every seventh word for a total of 40 deletions). 
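The fixed-ratio deletion used to build the cloze passage can be sketched as follows. This is only an illustration: the function name `make_cloze`, the blank marker, and the choice to begin deleting at the seventh word and stop once 40 blanks are reached are assumptions, since the report does not say where deletion started or why a 308-word story yielded 40 rather than 44 deletions.

```python
def make_cloze(text, every=7, max_deletions=40, start=7, blank="_____"):
    """Replace every `every`-th word with a blank, beginning at word `start`.

    Returns the cloze passage and the list of deleted (target) words.
    `start` and the cap behaviour are assumptions, not details from the report.
    """
    words = text.split()
    answers = []
    output = []
    for position, word in enumerate(words, start=1):
        on_cycle = position >= start and (position - start) % every == 0
        if on_cycle and len(answers) < max_deletions:
            answers.append(word)   # keep the target word for later scoring
            output.append(blank)
        else:
            output.append(word)
    return " ".join(output), answers


# Hypothetical 308-word passage made of placeholder tokens w1 ... w308.
passage = " ".join(f"w{i}" for i in range(1, 309))
cloze_text, targets = make_cloze(passage)
print(len(targets), targets[:3])
```

With these assumed settings the last four candidate positions are left intact, which is one way a 308-word text could end up with exactly 40 blanks.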

Stories. For purposes of comparing results with previous studies, 3 of the 4 stories employed by Weaver 
and Dickinson (1982) and Stein and Glenn (1979) were used: "Epaminondas," "The Tiger's Whisker,"
and "The Fox and Bear." These stories have a readability ranging from early grade-2 to early grade-3
level (using Spache's 1974 formula). The stories had been parsed according to the Stein and Glenn
grammar and showed little departure from canonical form. For purposes of illustration, "The Fox and
Bear" and corresponding tree structure for one of its episodes are shown in Figure 1.

[Insert Figure 1 about here.] 

Probe questions. Five or 6 probe questions were constructed for each story. The questions were 
written to assess children's comprehension of both inter- and intra-episodic causal relations. Questions 
concerning inter-episodic relations measured perceived causality between 2 episodes. Questions 
concerning intra-episodic relations measured perceived causality between statements within an episode. 
The latter contained statements from the internal response, attempt, direct consequence, and reaction 
categories. Setting and initiating event questions were not used as these categories have no causal 
referent according to the Stein and Glenn (1979) grammar. Internal plan questions were also not used
as this category occurred in only one story. 

Higher order probes were also constructed to assess the extent to which a child could retrace steps in 
a causal sequence. Use of the higher order probes was conditional on a child's correct response to the 
initial question and the probes followed the network of categories and causal relations postulated by the 
hierarchical structure of each story. An example from "The Fox and Bear" is as follows (E =
experimenter, C = child; numbers in parentheses correspond to statements in the story):

E (initial question): Why did the fox want to run out of the henhouse? (21)
C: Because he was frightened. (19)
E: Why was the fox frightened?
C: Because he heard a noise. (18)
E: What made the noise?
C: The bear on the roof made the roof crack. (17)

Initial questions were chosen to minimize interdependencies among the questions and yet maximize the 
opportunities for higher order responses. All questions were piloted with a sample of year 3 children. 
Questions were phrased as WHY questions whenever possible; WHAT or HOW questions were asked 
occasionally. 

Procedure 



Following initial selection of the children, the study was conducted in 2 phases. First, the tests were 
administered to the 60 poor readers for the purpose of classifying these readers into subgroups. 
Children were tested individually on the tests of oral language and reading speed and in small groups 
on the tests of vocabulary knowledge and sentence/passage comprehension. For the test of reading 
speed, children read the 3 passages orally and were asked to indicate which passage they liked best. The 
latter device was used to encourage the children to read the passages for meaning. For the test of 
sentence/passage comprehension, the cloze test was untimed and every assistance was given to children
in decoding words and spelling answers. 

Second, all children, both good and poor readers, were asked to listen to each story and orally retell it. 
Standard recall procedures were used. Children were interviewed individually and told that they would 
be asked to retell each story exactly as they heard it, and that they would have to answer questions about 
the story. They then listened to the first story read by the experimenter. Immediately following the 
presentation, they were asked to count to 50 by 3s, after which they retold the story and answered probe 
questions. This procedure was repeated for the second and third stories, with brief rest periods between 
each story. Order of presentation of the stories was randomized, as was order of presentation of the 
initial questions for each story. All recalls were tape-recorded and later transcribed. 

Scoring 

Scores on the Auditory Association, Grammatic Closure, and Sound Blending subtests were retained 
as raw scores, and results of the reading speed measure were expressed in words per second averaged 
over the 3 passages. Three variables were obtained from the cloze test: the proportion of blanks 
attempted that were answered with exact-replacements, the proportion of not-exact-replacements that 
were contextually (syntactically and semantically) acceptable within the sentence, and the proportion of 
not-exact-replacements that were contextually acceptable with prior sentence context only (blanks 
beginning a sentence did not figure into calculation of this variable). The exact-replacement score was
intended to reflect comprehension at the passage level, and the latter 2 scores, comprehension at the 
sentence level. On a random sample of 100 replacements, interrater agreement on judgments of 
contextual acceptability was 90%. 
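The three cloze variables can be illustrated with a short sketch. The `Blank` record, its field names, and the case-insensitive test for an exact replacement are assumptions made for illustration; in the study the acceptability judgments were made by human scorers, so they appear here simply as pre-coded flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Blank:
    target: str                     # the word deleted from the passage
    response: Optional[str]         # child's answer, or None if not attempted
    ok_in_sentence: bool = False    # rater: acceptable within the whole sentence
    ok_with_prior: bool = False     # rater: acceptable given prior sentence context only
    sentence_initial: bool = False  # blank is the first word of its sentence


def _is_exact(blank):
    # Assumed matching rule: ignore case and surrounding whitespace.
    return blank.response.strip().lower() == blank.target.lower()


def _ratio(numerator, denominator):
    return numerator / denominator if denominator else None


def cloze_scores(blanks):
    attempted = [b for b in blanks if b.response]
    inexact = [b for b in attempted if not _is_exact(b)]
    # Sentence-initial blanks have no prior sentence context, so they are excluded.
    prior_pool = [b for b in inexact if not b.sentence_initial]
    return {
        "exact": _ratio(len(attempted) - len(inexact), len(attempted)),
        "sentence": _ratio(sum(b.ok_in_sentence for b in inexact), len(inexact)),
        "prior_sentence": _ratio(sum(b.ok_with_prior for b in prior_pool), len(prior_pool)),
    }


# Hypothetical responses from one child.
example = [
    Blank("dog", "dog"),
    Blank("dog", "cat", ok_in_sentence=True, ok_with_prior=True),
    Blank("dog", "ran", sentence_initial=True),
    Blank("dog", None),
    Blank("dog", "pup", ok_in_sentence=True),
]
scores = cloze_scores(example)
print(scores)
```

Each proportion has its own denominator (blanks attempted, not-exact replacements, and not-exact replacements that do not begin a sentence), which is why the three values are computed from separate pools.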

Scoring of the free and probe recalls was undertaken without knowledge of children's reading ability 
(good or poor) and of their performance on the classification measures. Free recalls were scored for 
the number of statements within story grammar categories that were accurately recalled, using gist
criteria. On a random sample of 24 protocols, interscorer agreement in coding statements according
to the grammar was 93%. Probe recalls were scored for the maximum "height" attained in a causal
sequence. An incorrect response to the initial probe was given a score of 0, a correct response to the 
initial probe was given a score of 1, a correct response to the second probe was given a score of 2, and 
so on. If a child gave a correct response one or more steps removed from the immediate answer, credit 
was given for the "skipped" information. Height of response was expressed as a proportion of the 
maximum height possible. The maximum height was 2 for questions concerning inter-episodic relations 
and 3 for questions concerning intra-episodic relations. No formal estimate of reliability was obtained 
in scoring the probe recalls. Instead, a second scorer was consulted for a consensus judgment when
there was doubt about a child's response. 
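A minimal sketch of the height score follows. The function name `probe_height` and the input format (a list of the step numbers reached by a child's correct responses) are assumptions for illustration; the maximum heights of 2 and 3 are taken from the description above.

```python
# Maximum attainable height by type of causal relation, as stated in the text.
MAX_HEIGHT = {"inter": 2, "intra": 3}


def probe_height(correct_steps, relation):
    """Return height as a proportion of the maximum possible.

    `correct_steps` lists the step in the causal sequence reached by each
    correct response (1 = initial probe). An empty list means the initial
    probe was answered incorrectly. Because skipped steps are credited,
    only the highest step reached matters.
    """
    max_height = MAX_HEIGHT[relation]
    height = max(correct_steps, default=0)
    if not 0 <= height <= max_height:
        raise ValueError(f"height {height} outside 0..{max_height}")
    return height / max_height


print(probe_height([1, 2, 3], "intra"))  # every step answered in turn
print(probe_height([1, 3], "intra"))     # second step skipped but credited
print(probe_height([], "inter"))         # initial probe answered incorrectly
```

Expressing height as a proportion puts inter- and intra-episodic questions on a common scale despite their different maximum heights.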



Numerical Classification 

The 60 poor readers were classified into subgroups using Ward's (1963) minimum variance method of 
hierarchical agglomerative clustering. The proportions resulting from the cloze test were first 
normalized using an arcsin transformation; all 8 classification variables were then standardized for the
cluster analysis. The procedure started with each child as a separate cluster and successively combined
clusters that were most similar using squared Euclidean distance as the dissimilarity criterion. The result
was a set of groupings having similar profiles on the classification variables. The performance of the
poor readers was best described by a 7-cluster solution, as indicated by a marked discontinuity in the
dissimilarity coefficient in the transition from a 7- to a 6-cluster solution (cf. the scree test in factor
analysis). We chose to discard 2 children as outliers, and to place a cluster of 4 children with another
cluster, because they showed highly similar profiles and combined later in the clustering sequence. The
final result was a set of 5 subgroups for a classification of 58 (97%) of the poor readers. A similar set 
of subgroups was obtained for a classification of 52 (87%) of these poor readers using an alternative 
clustering algorithm (Johnson's 1967 maximum method). 
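The classification steps described above (transform, standardize, agglomerate) can be sketched in plain Python. This is a textbook rendering of Ward's criterion, not the software used in the study: the helper names are hypothetical, the arcsin transformation is assumed to be the common arcsine square-root form, and standardization is assumed to use the population standard deviation.

```python
import math
from statistics import mean, pstdev


def arcsin_sqrt(p):
    # Assumed form of the arcsin transformation for a proportion in [0, 1].
    return math.asin(math.sqrt(p))


def standardize(rows):
    # Convert each column to z scores (columns must not be constant).
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def ward(rows):
    """Agglomerate cases; return a list of (merge cost, merged member ids)."""
    clusters = {i: ([i], list(r)) for i, r in enumerate(rows)}  # id -> (members, centroid)
    history = []
    next_id = len(rows)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for k, a in enumerate(ids):
            for b in ids[k + 1:]:
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        ma, ca = clusters.pop(a)
        mb, cb = clusters.pop(b)
        n = len(ma) + len(mb)
        centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters[next_id] = (ma + mb, centroid)
        history.append((cost, sorted(ma + mb)))
        next_id += 1
    return history


# Hypothetical profiles: two tight pairs that lie far apart.
profiles = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
merges = ward(profiles)
for cost, members in merges:
    print(round(cost, 3), members)
```

A sharp jump in merge cost between successive steps is the kind of discontinuity used to choose the number of clusters; in this toy example the final merge is far more costly than the two before it, which would point to a 2-cluster solution.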

The standard score profiles of the subgroups are shown in Figure 2. The triangled data points indicate
small within-group variance relative to total variance, so they are most diagnostic for interpreting 
subgroup characteristics. Subgroup 1 showed average performance on the sentence comprehension 
variables, but children had difficulty achieving coherence at the passage level. This subgroup also 
showed low scores on the Auditory Association and vocabulary tests that suggest a lack of world 
knowledge. Subgroup 2 was marked by the contrast between relatively good sound blending ability and 
relatively slow reading speed, which suggests an over-reliance on within-word and phonic cues. 
Subgroup 3 showed above-average performance on almost all variables, suggesting that these readers 
had learned to integrate cues at the word, sentence, and passage levels. Subgroup 4 showed extremely 
poor comprehension at the sentence level and low performance on the Auditory Association test. 
Subgroup 5 was marked by extremely poor performance on all but the oral language variables. 
Although small (n = 3), this subgroup proved particularly robust in the cluster analyses. 



There are various criteria for judging the adequacy of a classification. Ideally, the subgroup profiles 
should be validated on an independent sample. In place of this, we performed a discriminant analysis 
to confirm that the subgroups were qualitatively distinct. Great care should be taken in interpreting the
significance levels, because the subgroups have already been made different on the basis of the
classification variables. However, if the discriminant analysis failed to show separation at conventional
significance levels (α = .05), there would have been little point in retaining the classification.



Results 



[Insert Figure 2 about here.] 
The discriminant analysis indicated that 3 significant discriminant functions explained 59%, 23%, and
10% of the variance, respectively (Wilks' Λ = .04, F(32,171) = 7.23, p < .05). Table 1 shows the
discriminant function loadings. Discriminant Function I loaded substantially on all variables except
Sound Blending and Prior Sentence. Discriminant Function II loaded positively on all oral language
variables, especially Sound Blending, and negatively on all written language variables, especially reading
speed. Discriminant Function III was defined by positive loadings on Grammatic Closure and Prior
Sentence and a negative loading on Sentence. The discriminant scores for the 58 children and the group
centroids were plotted on the first 2 discriminant functions (Figure 3). Subgroups 1 and 3 scored low
and high, respectively, on the first discriminant function, and Subgroup 5 was high on the second.
Variability in Subgroup 2 was accounted for by the second function. Subgroups 1, 2, 3, and 5 were 
distinct, but there was some overlap between Subgroups 1 and 4 (all subgroups were distinct in 3 
dimensions). 

[Insert Table 1 about here.] 

[Insert Figure 3 about here.]

As a further check on the classification, we performed a weighted-means ANOVA of the 5 subgroups 
using children's principal factor scores from the 3 initial measures of reading ability as the dependent 
variable. This would indicate whether the subgroups reflected nothing more than differences in reading 
ability. The ANOVA revealed a significant difference among subgroups, F(4,53) = 4.22, p < .05, but
post hoc Scheffé tests showed that the difference was due entirely to Subgroup 5 scoring lower than
Subgroup 3, F(1,53) = 13.73, p < .05. Thus, differences in reading ability alone could not account for
the classification. 1 

Story Recall 

Approach to analysis. The research question required pairwise comparison of the good readers with 
each of the subgroups of poor readers on recall variables. Accordingly, we performed univariate and 
multivariate planned comparisons on the Group factor. To reduce the opportunity for Type-I error, the 
5 pairwise comparisons were computed only if the omnibus F was significant (cf. Fisher's "protected t"
strategy for a posteriori comparisons). Because there was little interest in the Story factor, and no 
significant Group x Story interactions were found, results were collapsed over the 3 stories. Unless 
otherwise noted, results reported are for the untransformed scores. Analyses were also performed of 
normalized scores, using an arcsin transformation, but they produced the same results in all but one 
case. The nominal alpha was .05. 

Free recall. The mean proportions of statements recalled by each group of readers from each story as 
a whole are shown in Table 2. On average, the good readers recalled 53% of each story compared with 
40% by the poor readers. The low performance of the poor readers was by no means uniform, however. 
Univariate planned comparisons (following a significant omnibus F) showed that their lower recall was 
due largely to Subgroup 1, F(1,72) = 19.71, p < .05. The next lowest recall was shown by Subgroup 4,
but the difference relative to the good readers was not significant, F(1,72) = 3.68, p > .05. There was
a statistically significant difference between the good readers and Subgroup 2, F(1,72) = 5.83, p < .05,
but this was probably due to the enhanced power of the F-test for this effect (n = 20 for Subgroup 2). 

[Insert Table 2 about here.] 

The mean proportions of statements recalled within each story grammar category are also shown in 
Table 2. These were analyzed in a series of multivariate planned comparisons with 7 dependent
variables, one for each story grammar category (internal plan statements were included in the internal 
response category). The results largely paralleled those for total story recall. There were significant 
differences between the good readers and each of Subgroup 1 (multivariate F(7,66) = 4.84, p < .05) and
Subgroup 4 (multivariate F(7,66) = 2.56, p < .05). 

Following procedures recommended by Pedhazur (1982) and Stevens (1972), we performed post hoc 
discriminant analyses to determine which categories contributed most to these differences. Correlations 
of variables with the discriminant functions are shown in Table 3. The discriminant function separating 
the good readers and Subgroup 1 showed that all types of story information contributed to the 
difference, especially internal response and initiating event. The discriminant function separating the good 
readers and Subgroup 4 showed a different pattern; children in Subgroup 4 had most difficulty recalling 
information from the major setting and internal response categories whereas initiating event and direct 
consequence information did not contribute to the difference. 

[Insert Table 3 about here.] 

None of these results conclusively implicates deficits in story-specific knowledge. The depressed 
performance of Subgroups 1 and 4 might reflect simply poorer verbal memory in a global sense. 
Therefore, to minimize the effects of any general memory differences, we performed multivariate 
planned comparisons using analysis of covariance in which we covaried within-category recall on total 
recall (tests of parallelism of regression planes were satisfied in all cases). This approach most likely 
overcompensates for differences in total recall because it confounds treatment (i.e., the groupings) with 
the covariate and because of the positive within- and between-subjects correlations between the 
dependent variables and covariate. Our reasoning was that if any between-group differences remained 
after covarying on total recall, this would be fairly strong evidence in favour of story-specific effects. 
Results again paralleled our earlier findings but only for the normalized scores. There were significant 
differences between the good readers and each of Subgroup 1 (multivariate F(7,65) = 2.35, p < .05) and 
Subgroup 4 (multivariate F(7,65) = 2.15, p < .05). No significant differences were obtained with the 
untransformed scores, although the pattern of adjusted means was the same. 

A further test of story-specific effects was obtained by inspecting the rank orders of category recall (see 
Table 2). The order of recall shown by the good readers and each of Subgroups 2, 3, and 5 is similar 
to that reported by Stein and Glenn (1979) for normal readers. Indeed, the orderings shown by the 
good readers and Subgroup 5 are identical to those of Stein and Glenn. However, the patterns shown 
by Subgroups 1 and 4 depart substantially from the typical ordering. Subgroup 1 shows an inversion of 
reaction and minor setting (adjacent categories); recall of statements from these 2 categories was almost 
equivalent. Subgroup 4 shows an inversion of the major setting and initiating event categories, an even 
more substantial reversal. It is interesting to note that major setting, the category that is typically 
best remembered, ranked relatively low in the recalls of children in Subgroup 4. 
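
The rank-order inspection described above can be sketched in a few lines. The proportions are those listed in Table 2 for the good readers and Subgroup 4, with a leading decimal point assumed wherever the table shows a bare two-digit value; the helper names `recall_order` and `rank_of` are hypothetical.

```python
CATEGORIES = [
    "Major setting", "Direct consequence", "Attempt", "Initiating event",
    "Reaction", "Minor setting", "Internal response",
]

# Proportions as read from Table 2 (decimal points assumed where missing).
GOOD = [.83, .77, .62, .54, .50, .43, .31]
SUBGROUP_4 = [.38, .65, .48, .50, .25, .21, .17]

def recall_order(proportions):
    """Return category names sorted from best to worst recalled."""
    pairs = sorted(zip(proportions, CATEGORIES), key=lambda pc: -pc[0])
    return [name for _, name in pairs]

def rank_of(category, proportions):
    """1-based rank of a category within a group's recall order."""
    return recall_order(proportions).index(category) + 1

print(recall_order(GOOD))
print(recall_order(SUBGROUP_4))
print(rank_of("Major setting", GOOD), rank_of("Major setting", SUBGROUP_4))
```

Comparing the two printed orderings makes the displacement of major setting in Subgroup 4 easy to see without any inferential test.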

Probe recall. Separate analyses were performed for questions concerning inter-episodic causal relations 
and questions concerning intra-episodic causal relations. In each case, height of response in a causal 
sequence (expressed as a proportion of the maximum height possible) was averaged over those questions 
of a given grammatical type. There were 3 initial probes for inter-episodic relations, and 3 or 4 initial 
probes for each of 4 categories for intra-episodic relations. We chose to conditionalize height of recall 
on the accuracy of response by averaging over only those initial probe questions that were correctly 
answered. Otherwise, height of recall would have been confounded with response accuracy, because a 
child who provided only the immediately prior causal referent for most questions (i.e., scoring low in 
the sequence) would receive a similar score to that of a child who retrieved information further back 
from the probe (i.e., scoring high in the sequence) but on only 1 or 2 questions. Necessarily, children 
scoring 0 on all questions of a given type were excluded from the analysis, resulting in a reduced sample 
size. 

For questions concerning inter-episodic causal relations, univariate planned comparisons showed that 
there was a significant difference in mean height of probe recall between the good readers (mean = .74) 
and Subgroup 1 (mean = .56) (F(1,71) = 10.84, p < .05). There were no significant differences between 
the good readers and each of Subgroups 2, 3, 4, and 5. The means for these subgroups were in fact highly 
uniform (means = .73, .73, .69, .69, respectively). 
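
The conditional height-of-recall measure used in these analyses can be sketched as follows. This is an illustration only: the data layout (a list of correct/height pairs per child), the function names, and the three example children are all hypothetical.

```python
def conditional_height(probes):
    """Mean height over correctly answered probes only.

    probes: list of (correct, height) pairs, where height is the proportion of
    the maximum possible height in the causal sequence. Returns None when no
    probe was answered correctly, so the child drops out of the analysis.
    """
    heights = [h for correct, h in probes if correct]
    if not heights:
        return None
    return sum(heights) / len(heights)

def group_mean(children):
    """Mean conditional height over children with at least one correct probe."""
    scores = [s for s in map(conditional_height, children) if s is not None]
    return sum(scores) / len(scores) if scores else None

# Hypothetical children, 4 probes each.
child_a = [(True, 0.25)] * 4                  # always correct, always low
child_b = [(True, 1.0)] + [(False, 0.0)] * 3  # correct once, but high
child_c = [(False, 0.0)] * 4                  # never correct: excluded

print(conditional_height(child_a), conditional_height(child_b), conditional_height(child_c))
print(group_mean([child_a, child_b, child_c]))
```

An unconditional average over all 4 probes would give children A and B the same score (.25), which is the confound with response accuracy that the conditional metric is meant to avoid.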

For questions concerning intra-episodic causal relations, there were 4 categories: internal response, 
attempt, direct consequence, and reaction. Because of the conditional metric, sample sizes varied and the 
omnibus tests were performed using separate weighted-means ANOVAs for each category of question. 
There were no significant differences among groups on any category and planned comparisons were not 
pursued. Table 4 shows that the mean heights of probe recall in categories within a story episode were 
remarkably similar across groups. 



The purpose of the present study was to identify poor readers who show weaknesses in their knowledge 
and use of story structure. Using a traditional story recall paradigm, combined with numerical 
classification procedures, we were able to show that only some poor readers evidenced reduced 
sensitivity to the structure of stories, namely those represented by Subgroups 1 and 4. Children in 
Subgroup 1 (n = 13) performed poorly on the Auditory Association and vocabulary tests and on the 
passage-level cloze task. Children in Subgroup 4 (n = 7) performed poorly on the Auditory Association 
test and on the sentence-level cloze task. The diagnostic profiles of both subgroups reveal difficulties 
in integrating information from a story at a macro-level. These findings provide evidence of marked 
heterogeneity in poor readers' story comprehension and recall, and they suggest that deficiency in story 
schema or failure to use story schema may be responsible for the reading problems experienced by some 
poor readers. 

The finding of significant and meaningful differences between good and poor readers in sensitivity to 
story grammar categories is at variance with the findings of Backman et al. (1984), McConaughy (1985), 
Summers (1980), Weaver and Dickinson (1982), and Worden et al. (1982). Our results are, however, 
consistent with those of Fitzgerald (1984), Hinchley and Levy (1988) and Rahman and Bisanz (1986). 

Using measures of probe recall, we were also able to show that readers in one of the subgroups 
(Subgroup 1) had difficulty retrieving information high in a causal sequence, but only when the sequence 
crossed story episode boundaries. Height of response to questions about causal relations within story 
episodes was not a source of variability for these poor readers. This finding seems in agreement with 
the evidence concerning the constituent structure of story grammar models. Previous studies have found 
evidence of longer processing time at episode boundaries than within episodes (Haberlandt, 1980; 
Haberlandt, Berian, & Sandson, 1984; Mandler & Goodman, 1982; Thorndyke, 1978; Yekovich & 
Thorndyke, 1981). Boundary effects for constituents within the episode are less common, although some 
effects for story grammar categories have been reported (Mandler & Goodman, 1982). 

The finding of no such trend in the probe recall for Subgroup 4 is interesting. The difference between 
Subgroups 1 and 4 in probe but not free recall may lie in the nature of the recall tasks. Organization 
of retrieval is important in free recall and less so in probe recall where the retrieval cues are provided. 
Hence, children in Subgroup 4 may have had little difficulty encoding and storing story information, but 
may have had trouble accessing it in free recall. Children in Subgroup 1, on the other hand, may have 
had difficulties in both encoding and storage as well as retrieval. 



[Insert Table 4 about here.] 

Despite the refinements employed in the present study, there are still a number of conceptual problems 
attendant upon the design. Although the study examined readers' sensitivity to story grammar 
categories, the evidence is not conclusive as to whether story-specific deficits are implicated in the 
reduced performances of poor readers in Subgroups 1 and 4. Children might have had a poorer verbal 
memory in a global sense. We believe that 4 outcomes from the study point to there being story-specific 
deficits. First, results of the discriminant analyses showed that not all story grammar categories were 
equally at risk for children in the 2 subgroups. Noteworthy was the finding that information concerning 
the protagonist's goals, plans, and thoughts (the internal response/plan categories) presented major 
difficulties for both subgroups. Second, children in Subgroups 1 and 4 showed substantial departures 
from the usual pattern of category recall, whereas the patterns shown by all other children were 
consistent with that predicted by the Stein and Glenn (1979) grammar. Third, even when we tried to 
control for memory differences using overall recall as a covariate, we failed to eliminate all between- 
group differences in category recall (at least for normalized scores). Fourth, the results from the probe 
recall for Subgroup 1 revealed differential retrieval of story information within and between episodes, 
a finding that previous research suggests is consistent with story-specific effects (e.g., Haberlandt, 1980). 

Another problem is that the evidence relating weaknesses in story knowledge to differences in reading 
ability arising from the present study is correlational and no causal connection can be inferred. Several 
interpretations are possible. As previous studies have suggested, lack of story knowledge may be a cause 
of the reading difficulties. Alternatively, it is possible that lack of story knowledge may be a 
consequence of the reading difficulties. Poor readers may read less, experience less exposure to stories, 
and thus fail to develop adequate knowledge of the structure of stories. Indeed, several researchers have 
speculated that memory differences between good and poor readers might be a consequence of 
differential reading practice (Bjorklund & Bernholtz, 1986; Torgeson, 1985). Also likely is that story 
knowledge and reading difficulty are reciprocally related. Still another interpretation is that both story 
knowledge and reading ability may be related to a third variable, such as the experience of being read 
to in the home (for useful discussions of the ways in which good-poor reader differences and their 
relationship to reading achievement may be interpreted, see Backman et al., 1984; Kleiman, 1985; 
Stanovich, 1986; Valtin, 1978-1979). 

A third problem is that the present results do not deny the existence of subgroups within the good 
reader population as well as the poor reader population (see Lipson & Wixson, 1986; Singer, 1982). 

Implications of the present results for classroom practice need to be considered carefully. We used the 
numerical classification procedure as a descriptive, rather than predictive, device to address the problem 
of heterogeneity in our sample of poor readers. Using this approach, we were able to detect 
ability-group differences in story comprehension and recall that heretofore may have gone unnoticed. 
Nevertheless, our findings leave open the possibility that some test measures (especially the cloze tasks) 
may have a predictive, possibly diagnostic, power in the identification of children who lack sensitivity to 
story structure. Our classification tests are very traditional measures and have a history of use in 
schools. 

Most research shows that some form of instruction in story grammar elements promotes children's 
comprehension and recall of stories (for a review of instructional studies, see Baumann & Bergeron, 
1993). How can these results be reconciled with findings from the present study? We believe that for 
many children sensitivity to story structure is a form of tacit knowledge; any instruction that formalizes 
knowledge of story structure should enhance the likelihood that they will use this knowledge when next 
they are asked to demonstrate comprehension and recall of story elements. However, few studies have 
examined the differential response of children to story grammar instruction, beyond using fairly gross 
measures of ability, and few studies have separated out effects of such instruction for children for whom 
knowledge of story structure is developed but underutilised from those for whom it is lacking. The 
implications of the present study are that some poor readers may have qualitatively different problems 
processing stories, relative to other poor readers, which may require a more concerted approach to 
instruction in story structure. Teachers may wish to select children for story grammar instruction (cf. 
Fitzgerald & Spiegel, 1983) and to include in such instruction not only explicit teaching of story elements 
but also emphasis on the causal relations among the elements, especially between episodes. 

Footnote 

1 Of course, it is still possible that the subgroups reflect developmental differences in the 
acquisition of reading ability, and that these differences are not well described by a single metric. 



Table 1 

Correlations of Variables with Discriminant Functions 

                           Discriminant Function 

Variable                    I        II       III 

Auditory Association       .45      .52      -.22 
Grammatic Closure          .58      .55       .34 
Sound Blending             .27      .62       .19 
Reading Speed              .52     -.58      -.05 
Vocabulary                 .81     -.16       .08 
Exact Replacement          .82     -.19      -.17 
Sentence                   .54     -.26      -.45 
Prior Sentence            -.05     -.53       .69 

Table 2 



Mean Proportions of Story Information Recalled by Group 

                        Good                 Subgroup 
Category               Readers     1      2      3      4      5 

Major setting            .83      .46    .60    .62    .38    .78 
Direct consequence       .77      .51    .60    .73    .65    .75 
Attempt                  .62      .33    .49    .51    .48    .60 
Initiating Event         .54      .23    .43    .52    .50    .46 
Reaction                 .50      .20    .40    .35    .25    .33 
Minor setting            .43      .22    .27    .25    .21    .07 
Internal response        .31      .09    .21    .29    .17    .21 

Total                    .53      .28    .41    .47    .40    .48 

Table 3 

Correlations of Variables with Discriminant Functions 

                            Good Readers vs 

Category               Subgroup 1     Subgroup 4 

Major setting             .63*           .81* 
Direct consequence        .57*           .19 
Attempt                   .67*           .42 
Initiating Event          .76*           .09 
Reaction                  .65*           .34* 
Minor setting             .45*           .55* 
Internal response         .81*           .58* 

* Significant univariate F-test of difference in recall, p < .05 



Table 4 

Mean Height of Probe Recall of Intra-Episodic Relations by Group 

                        Good                 Subgroup 
Category               Readers     1      2      3      4      5 

Internal response        .96     1.00    .94    .99    .95   1.00 
Attempt                  .95      .87    .94    .89    .96    .75 
Direct consequence       .86      .83    .89    .88    .83    .78 
Reaction                 .91      .88    .87    .95    .86    .81 

Figure Captions 



Figure 1a. "The Fox and the Bear" parsed into story grammar categories. Figure 1b. Tree structure for 
Episode 3 of "The Fox and the Bear." 

Figure 2. Standard score profiles of 5 subgroups of poor readers on eight classification measures 
(AA = Auditory Association, GC = Grammatic Closure, SB = Sound Blending, SPEED = Reading Speed, 
VOCAB = Vocabulary, ER = Exact Replacement, SENT = Acceptable within Sentence, PRIOR 
SENT = Acceptable in Prior Sentence Context). 

Figure 3. Plot of 5 subgroups of poor readers on Discriminant Functions I and II (Subgroups 1, 2, 3, 
5 circled). 



Story grammar categories, in order of appearance, for statements 1-14: Major Setting, Minor Setting, 
Internal Response, Minor Setting, Attempt, Internal Response, Attempt, Internal Response, Attempt 

                       1. There was a fox and a bear 
                       2. who were friends 
                       3. One day they decided to catch a chicken for supper 
                       4. They decided to go together 
                       5. because neither one wanted to be left alone 
                       6. and they both liked fried chicken 
                       7. They waited until night time 
                       8. Then they ran very quickly to a nearby farm 
                       9. where they knew chickens lived 
                      10. The bear, who felt very lazy 
                      11. climbed upon the roof 
                      12. to watch 
                      13. The fox then opened the door of the henhouse very carefully 
                      14. He grabbed a chicken 
Direct Consequence    15. and killed it 

Story grammar categories, in order of appearance, for statements 16-21: Initiating Event, Internal 
Response, Minor Setting, Internal Response 

                      16. As he was carrying it out of the henhouse 
                      17. the weight of the bear on the roof caused the roof to crack 
                      18. The fox heard the noise 
                      19. and was frightened 
                      20. but it was too late 
                      21. to run out 
Direct Consequence    22. The roof and the bear fell in 
                      23. killing five of the chickens 
                      24. The fox and the bear were trapped in the broken henhouse 
Attempt               25. Soon the farmer came out 
Internal Response     26. to see what was the matter 

Figure 1a 



Episode 3 

Initiating Event -INITIATE- Response 

    Initiating Event: Action (16) -AND- Natural Occurrence (17) -CAUSE- Internal Event (18) 

    Response: Internal Response -MOTIVATE- Plan Sequence 

        Internal Response: Affect (19) -CAUSE- Goal (21) 

        Plan Sequence: Internal Plan (omitted) -MOTIVATE- Plan Application 

            Plan Application: Attempt (omitted) -RESULT- (Setting -ALLOW- Resolution) 

                Setting: State (20) 

                Resolution: Direct Consequence -INITIATE- Reaction (omitted) 

                    Direct Consequence: Natural Occurrence (22) -CAUSE- (Natural Occurrence (23) -AND- End State (24)) 

Figure 1b 